Abstract — A decentralized network of cognitive and non-
cognitive transmitters where each transmitter aims at maximizing
his energy-efficiency is considered. The cognitive transmitters are
assumed to be able to sense the transmit power of their non-
cognitive counterparts and the former have a cost for sensing.
The Stackelberg equilibrium analysis of this two-level hierarchical
game is conducted, which allows us to better understand the 
effects of cognition on energy-efficiency. In particular, it is proven 
that the network energy-efficiency is maximized when only a 
given fraction of terminals are cognitive. Then, we study a sensing 
game where all the transmitters are assumed to take the decision 
whether to sense (namely to be cognitive) or not. This game is 
shown to be a weighted potential game and its set of equilibria 
is studied. Playing the sensing game in a first phase (e.g., of a 
time-slot) and then playing the power control game is shown to 
be more efficient individually for all transmitters than playing a 
game where a transmitter would jointly optimize whether to sense 
and his power level, showing the existence of a kind of Braess 
paradox. The derived results are illustrated by numerical results 
and provide some insights on how to deploy cognitive radios in 
heterogeneous networks in terms of sensing capabilities. 

Index Terms — Power Control, Stackelberg Equilibrium, 
Energy-Efficiency. 



I. Introduction 

In fixed communication networks, the paradigm of peer-to- 
peer communications has known a powerful surge of interest 
during the past two decades with applications such as the 
Internet. Remarkably, this paradigm has also been found to 
be very useful for wireless networks. Wireless ad hoc and 
cognitive networks are two illustrative examples of this. One 
important typical feature of these networks is that the terminals 
have to take some decisions in an autonomous or quasi- 
autonomous manner. Typically, they can choose their power 
control and resource allocation policy. The corresponding 
framework, which is the one of this paper, is the one of 
decentralized or distributed power control (PC) or resource 
allocation. More specifically, the scenario of interest is the 
case of power control over quasi-static channels in cognitive 
networks fT?!. In such a context, which is broader than the 
one of ad hoc and cognitive wireless networks, we assume 
that some (possibly all) transmitters are able to sense the power 
levels of non-cognitive transmitters and adapt their power level 
accordingly. The considered model of multiuser networks is
a multiple access channel (MAC) with time-selective non-
frequency selective links, but the methodology can be applied
to other types of interference networks. Technical issues
related to spectrum usage are not considered in this paper and
are left as a relevant extension. Rather, we
want to study the effect of cognition in terms of energy usage, 
the potential benefits in terms of spectral efficiency having 
been well investigated. The selected performance metric for 
a transmitter is derived from the energy-efficiency definition 
of [16]. The authors of [16] define energy-efficiency as the
number of bits successfully decoded by the receiver per joule
consumed at the transmitter (in [16] the radiated power is
concerned). More specifically, the authors analyze the problem 
of decentralized power control in flat fading multiple access 
channels. The problem is formulated as a non-cooperative one- 
shot game where the players are the transmitters, the action of 
a given player is her/his/its transmit power ("his" is chosen 
in this paper), and his payoff/reward/utility function is the 
energy-efficiency of his communication with the receiver; we 
will not provide here the motivations for using game theory 
to study distributed power control problems but some of them 
can be found e.g., in [23]. The results reported in [16] have
been extended to the case of multi-carrier systems in fTPl. 

The framework of the present work is close in spirit to 
[16], [?] but differs from them in several aspects. The
most important one is that there can be a hierarchy among
the transmitters in terms of observation capabilities: some
transmitters can be cognitive and observe the others whereas
the latter cannot observe the actions played by the former.
Technically, this leads to a Stackelberg-type formulation of the
problem [34]. The closest work to the one reported here is [22]
where a Stackelberg model of energy-efficient power control
problems is introduced for the first time. The present work
reports a significant extension of the framework introduced in
[22]. Two games are studied in detail. The power control game
corresponds to a generalization of the game addressed in [22]:
the sensing costs are taken into account (observing/sensing the 
others has a cost) and more importantly, our analysis is not 
limited to one non-cognitive transmitter (i.e., a single game 
leader). Then, we introduce a new game where the transmitters 
decide whether to sense or not. A third game, which is a
hybrid control game and includes the two mentioned games as
special cases, is shown not to be worth studying because
of the existence of a Braess paradox [?].

The paper is organized as follows. In Sec. II, the assumed
signal model to describe the distributed power control problem 
over time-selective non-frequency selective multiple access 
channels is provided. Known results concerning the case where 
the transmitters tune their power levels from block to block in 
a distributed way and without observing the other transmitters 
(i.e., they cannot sense the powers chosen by the others) are 
provided. In Sec. III, we assume that some transmitters have
sensing capabilities, which creates a hierarchy in terms of 
observation capabilities between the transmitters. The effect of 
this is that choosing rational power control policies in this set- 
ting leads to a more efficient network outcome (a Stackelberg 
equilibrium), provided that the sensing cost for a cognitive 
transmitter is not too high. While in Sec. III each transmitter
is told whether to sense or not, it is assumed in Sec. IV that this
choice is left to the transmitter itself. It is shown that there 
exists an optimal number of cognitive transmitters in terms 
of network utility and therefore having too many advanced 
terminals can be detrimental to the global performance. It is 
shown that letting a transmitter jointly choose his power level
and whether to sense is in fact less energy-efficient than
requiring the transmitters to choose these two quantities
separately. This shows the interest
in studying the power control game (as in Sec. III) and the
sensing game separately. The sensing game is a new game
we introduce and is shown to possess attractive properties for
distributed optimization and learning algorithms. Finally, in
Sec. V numerical illustrations are provided and the paper is
concluded in Sec. VI.



II. Problem statement

A. System model

We consider a decentralized multiple access channel with
a finite number of transmitters, which is denoted by K. The
network is said to be decentralized in the sense that the receiver
(e.g., a base/mobile station) does not dictate to the transmitters
(e.g., mobile/base stations) their power control policy. Rather,
all the transmitters choose their policy by themselves and want
to selfishly maximize their energy-efficiency. In particular,
they can ignore some specified centralized policies. We assume
that the users transmit their data over time-selective non-
frequency selective channels, at the same time and on the
same frequency band; channels are considered to be constant
over each block of data. Note that a block is defined as a
sequence of M consecutive symbols which comprises a training
sequence, that is, a certain number of consecutive symbols used
to estimate the channel (or other related quantities) associated
with a given block. A block has therefore a duration less than
the channel coherence time. The equivalent baseband signal at
the receiver can be written as

$$y(t) = \sum_{k=1}^{K} h_k(t)\, x_k(t) + z(t) \qquad (1)$$

where $k \in \mathcal{K}$, $\mathcal{K} = \{1, \dots, K\}$, $x_k(t)$ represents the symbol
transmitted by transmitter k at time $t \in \mathbb{N}$, $E|x_k|^2 = p_k$,
the noise z is assumed to be distributed according to a
zero-mean Gaussian random variable with variance $\sigma^2$ and
each channel gain $h_k$ varies over time but is assumed to be
constant over each block; the symbol index t will be omitted
in this paper. In terms of channel state information (CSI),
the receiver is assumed to know all the channel gains (global
CSI) while each transmitter only knows his own channel (local
CSI). For each block, the expression of the receive signal-to-
interference-plus-noise ratio (SINR) of user k is given by:

$$\gamma_k = \frac{g_k p_k}{\sigma^2 + \sum_{j \neq k} \theta_j g_j p_j} \qquad (2)$$

where for all $j \in \mathcal{K}$, $g_j = |h_j|^2$ and $\theta_j$ represents a parameter
depending on the interference scenario. For example, in a
random code division multiple access (RCDMA) system with
single-user decoding, we would have $\theta_j = \frac{1}{N}$ where N is
the spreading factor [11]. This is the choice we make.
Nonetheless, note that the present work is not restricted to
code division multiple access (CDMA) systems. Indeed, by
choosing $\theta_j = 1$ (i.e., $N = 1$) the above signal model
corresponds to the information-theoretic channel model used for
studying multiple access channels [35], [9]; in this setup, good
channel codes are assumed (see e.g., [16] for more comments on
the multiple access technique involved). Indeed, what matters
the most in the model is that it captures the different aspects
of the problem (especially the SINR structure). Finally, the
case where successive interference cancellation is used at the
receiver ($\theta_j \in \{0, 1\}$, depending on the decoding order) is left
as an extension of the present work.



B. Performance metric 

Assuming the above signal model, we assume that each 
transmitter wants to selfishly maximize the energy-efficiency 
of his communication with the receiver. The used performance 
metric is the one originally proposed in [16]. For a given block,
transmitter k wants to maximize the following quantity:

$$u_k(p_k, \mathbf{p}_{-k}) = \frac{R_k f(\gamma_k)}{p_k} \quad \text{[bit/J]} \qquad (3)$$

where $R_k$ is the transmission rate (in bit/s), f is an
efficiency function (representing the block success rate), and the
subscript $-k$ on vector $\mathbf{p}$ stands for "all the transmitters except
transmitter k", i.e., $\mathbf{p}_{-k} = (p_1, \dots, p_{k-1}, p_{k+1}, \dots, p_K)$.
Note that, as a standard assumption, $R_k$ is assumed to be
independent of $\gamma_k$ or $p_k$, which may correspond in practice to
a given choice of modulation coding scheme. As motivated in 
II3TI . im, the efficiency function is assumed to be an increasing 
and sigmoidal (or S-shaped) function verifying $0 \le f(\cdot) \le 1$
with $f(0) = 0$ and $\lim_{x \to +\infty} f(x) = 1$. The fact that f is sigmoidal
has at least two important consequences: the utility function
$u_k$ is quasi-concave w.r.t. $p_k$ and the derivative of $u_k$ vanishes
at only one point which is different from 0. We see that
$R_k$ might be chosen to be SINR-dependent without affecting
the problem analysis provided that the product $R_k f$ is a
sigmoidal function. For the sake of clarity, we assume that the
players have the same efficiency function f. In [16] and related
works, $p_k$ represents the power radiated by the transmitter.



Interestingly, the above utility can also model situations where 
the power consumed by the whole transmitting device has to 
be accounted for. Indeed, by replacing the denominator of $u_k$
by $a p_k + b$, (a, b) being a pair of non-negative constants, one
obtains a first-order model of the device power consumption
which includes both the consumption part which does not
depend on the radiated power and the one due to the transmit
power [13]. This does not change significantly the mathematical
analysis of the power control problem. In order to focus
our attention on the most important points of our analysis and
make the exposition as clear as possible, the original model
of [16] has been selected (i.e., (a, b) = (1, 0)).
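
As an illustration, the performance metric of this subsection can be evaluated numerically. The following is a minimal Python sketch and not the authors' code. It assumes the random CDMA interference model with $\theta_j = 1/N$, so that the SINR of user k is $g_k p_k / (\sigma^2 + \frac{1}{N} \sum_{j \neq k} g_j p_j)$, and it uses $f(x) = e^{-c/x}$ as one possible sigmoidal efficiency function. The function names and the numerical values are hypothetical.

```python
import math

def sinr(k, p, g, sigma2, N):
    """SINR of user k, assuming theta_j = 1/N (random CDMA, single-user decoding)."""
    interference = sum(g[j] * p[j] for j in range(len(p)) if j != k) / N
    return g[k] * p[k] / (sigma2 + interference)

def efficiency(x, c=1.0):
    """Assumed sigmoidal efficiency function f(x) = exp(-c/x), with f(0) = 0."""
    return math.exp(-c / x) if x > 0 else 0.0

def utility(k, p, g, sigma2, N, R=1.0, c=1.0):
    """Energy-efficiency of user k in bit/J: R * f(SINR_k) / p_k."""
    if p[k] <= 0:
        return 0.0  # no radiated power, no bits delivered
    return R * efficiency(sinr(k, p, g, sigma2, N), c) / p[k]

# Hypothetical two-user example with spreading factor N = 4.
powers = [1.0, 2.0]
gains = [1.0, 0.5]
print([utility(k, powers, gains, sigma2=1.0, N=4) for k in range(2)])
```

Replacing the denominator `p[k]` by `a * p[k] + b` would give the first-order device consumption model mentioned above.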

C. Game-theoretic modeling: review of the non-cooperative
game of [16]

An appropriate model for the power control problem
described above is given by a strategic form game [16]. A
strategic form game consists of an ordered triplet comprising
the set of players, their action or strategy sets, and their
preference orders (or their utilities when they exist, which is
the case here). The set of players is the set of transmitters
$\mathcal{K}$, the action sets are $\mathcal{P}_k = [0, P_k^{\max}]$, $k \in \mathcal{K}$, and the
utility functions are defined by (3). This describes the model
introduced by Goodman et al. in [16]. As in [16] and related
references such as [?], the power levels are chosen to be
continuous. This allows us to conduct a complete comparison
analysis in terms of performance. However, this assumption
is not always suited and the case of discrete power levels
is therefore left as a complementary way of tackling the
problems under investigation. An important solution concept
for this game is the Nash equilibrium^1 (NE) [28], which is a
power profile/vector that is robust against unilateral deviations 
(no player has interest in deviating if the others keep the 
equilibrium strategy). The unique NE of this power control 
game is:

$$p_k^* = \frac{\sigma^2}{g_k}\, \frac{N \beta^*}{N - (K-1)\beta^*}, \quad \forall k \in \mathcal{K} \qquad (4)$$

where $\beta^*$ denotes the best SINR choice for user k at the
NE; as explained in [16], [?], a necessary condition for this
equilibrium to be defined is that the system load is not too high
($\frac{K-1}{N} \beta^* < 1$). Note that the equilibrium SINR is common to
all users. It is easy to verify that $\beta^*$ is the positive solution of
the equation $x f'(x) - f(x) = 0$, which is obtained
by solving $\max_x \frac{f(x)}{x}$, i.e., a problem equivalent to $\max_{p_k} u_k$
(following from the assumption that $R_k$ is independent of $\gamma_k$
and $p_k$).
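
The computation of the equilibrium SINR and of the NE powers can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the root of $x f'(x) - f(x) = 0$ is found by bisection on the equivalent condition $x f'(x)/f(x) = 1$, for the two efficiency functions $f(x) = e^{-c/x}$ and $f(x) = (1 - e^{-x})^M$, and the NE power is taken to be $p_k^* = \frac{\sigma^2}{g_k} \frac{N\beta^*}{N - (K-1)\beta^*}$ (random CDMA with $\theta_j = 1/N$). All function names are hypothetical.

```python
import math

def bisect_root(h, lo, hi, tol=1e-12):
    """Bisection for a sign change of h on [lo, hi]."""
    h_lo = h(lo)
    if h_lo * h(hi) > 0:
        raise ValueError("no sign change on the bracket")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h_lo * h(mid) > 0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of (or at) mid
        if hi - lo <= tol * max(1.0, abs(mid)):
            break
    return 0.5 * (lo + hi)

def beta_star_exponential(c):
    """f(x) = exp(-c/x): x f'(x)/f(x) = c/x, so the root is x = c."""
    return bisect_root(lambda x: c / x - 1.0, 1e-3 * c, 1e3 * c)

def beta_star_power(M):
    """f(x) = (1 - exp(-x))**M with M > 1: x f'(x)/f(x) = M x / (exp(x) - 1)."""
    return bisect_root(lambda x: M * x / math.expm1(x) - 1.0, 1e-6, 100.0)

def nash_powers(g, sigma2, N, beta):
    """NE powers, assuming theta_j = 1/N; requires (K-1)*beta/N < 1."""
    K = len(g)
    if (K - 1) * beta >= N:
        raise ValueError("system load too high: need (K-1)*beta/N < 1")
    q = sigma2 * N * beta / (N - (K - 1) * beta)  # common received power
    return [q / gk for gk in g]

beta = beta_star_exponential(1.0)
print(beta, nash_powers([1.0, 0.5], sigma2=1.0, N=4, beta=beta))
```

At these powers every user receives the same power $g_k p_k^*$, which is why the equilibrium SINR is common to all users.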

The equilibrium solution holds if the power constraint $p_k \le
P_k^{\max}$ is satisfied, which is what we will assume throughout
this paper (see e.g., [16] for further details about the case
where the constraint is active). This game model, although
leading to a decentralized solution in terms of decisions and
CSI (see [16]), has one main drawback: the equilibrium
solution can be inefficient. Interestingly, introducing some
hierarchy between the players in terms of observation can
^1 App. A reviews several game-theoretic notions. Note that the unconditional
existence of a pure NE is, in part, a consequence of quasi-concavity for the
utility functions.



improve the game outcome, as shown in [22]. It turns out
that hierarchy is naturally present in networks where some
transmitters are equipped with a cognitive radio while the
others are not. This is one of our motivations for formulating
the problem in decentralized cognitive networks as a two-
level Stackelberg game, with arbitrary numbers of cognitive
radios, generalizing the 1-leader (K-1)-follower game of [16].
Compared to the latter reference [16], a second interesting
feature of the game described below is that the cost induced by
sensing is accounted for in the utility function of the cognitive
transmitters. The proposed approach may be relevant in most
applications where cognitive radio is useful. Indeed, one of the
messages of this work is that if the fraction of transmitters who
can observe their environment is too high, this may degrade
the global performance. As an existing scenario where
this type of approach might be applied in the future, the
case of WiFi systems can be mentioned. In France, operators
provide increasingly advanced access points. Typically,
they want to optimize channel selection (which is a special
case of power allocation) in an increasingly efficient manner.
Assuming that some access points (AP) are optimized according
to Nash strategies while others implement Stackelberg
strategies allows one to provide a simplified model to account
for the fact that advanced APs coexist with less advanced APs.
Interestingly, as shown in this paper, as far as power control
is concerned, having too many advanced APs might not be as
good as common sense would indicate.



III. The two-level power control game with 

SENSING COSTS 

The set of transmitters $\mathcal{K} = \{1, 2, \dots, K\}$ comprises F
terminals equipped with a cognitive radio while the $L = K - F$
other terminals have no sensing capabilities. The pair (F, L)
is assumed to be fixed throughout the whole section; it will
be optimized in a centralized (resp. decentralized) manner
in Sec. IV-A (resp. Sec. IV-B). Without loss of generality,
the set of non-cognitive (resp. cognitive) terminals will be
$\mathcal{L} = \{1, 2, \dots, L\}$ (resp. $\mathcal{F} = \{L+1, L+2, \dots, K\}$). This two-
level hierarchical game is played as follows. For each block,
the non-cognitive transmitters (called the leaders) choose their
power level rationally knowing that their decisions are going
to be observed by the cognitive transmitters. The cognitive
transmitters (called the followers) react to these decisions
rationally. A choice is said to be rational in the sense that
the transmitter maximizes his utility. To this end, we denote
by $\mathbf{p}_{\mathcal{L}} = (p_1, \dots, p_L)$ and $\mathbf{p}_{\mathcal{F}} = (p_{L+1}, \dots, p_K)$ the vectors
of actions (transmit powers) of the leaders and followers,
respectively. Also denote by $\mathcal{U}^*(\mathbf{p}_{\mathcal{L}})$ the set of NE for the
group of followers when the leaders play $\mathbf{p}_{\mathcal{L}}$. The resulting
outcome of this interaction is a Stackelberg equilibrium (SE),
which is defined as follows.

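Definition 3.1 (Stackelberg equilibrium): A vector of
actions $\mathbf{p}^{SE} = (\mathbf{p}_{\mathcal{L}}^{SE}, \mathbf{p}_{\mathcal{F}}^{SE})$ is called a Stackelberg equilibrium
if $\mathbf{p}_{\mathcal{F}}^{SE} \in \mathcal{U}^*(\mathbf{p}_{\mathcal{L}}^{SE})$ and the action profile $\mathbf{p}_{\mathcal{L}}^{SE}$ is a Nash equilibrium
for the leaders^2.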

By looking at the mathematical expression of the Stackelberg
equilibrium defined above, we can see that if the NE
exists, then the SE also exists. However, the SE is in general not
included in the set of NE of a non-cooperative game. There exist
several examples, like the Cournot game, in which the action chosen
by the leader at the SE differs from the action
chosen at the NE [15]. As the best response of each player
is a scalar-valued function (see [22]), the determination of
the Stackelberg equilibria of the game amounts to solving the
following bi-level optimization problem:



$$p_\ell^{SE} \in \arg\max_{p_\ell} \bar{u}_\ell\big(p_\ell, \mathbf{p}_{-\ell}, p_{L+1}(p_\ell, \mathbf{p}_{-\ell}), \dots, p_K(p_\ell, \mathbf{p}_{-\ell})\big), \quad \forall \ell \in \mathcal{L} \qquad (5)$$

where for all $\mathbf{p}_{\mathcal{L}}$,

$$p_f(\mathbf{p}_{\mathcal{L}}) = \arg\max_{p_f} \bar{u}_f\big(\mathbf{p}_{\mathcal{L}}, p_{L+1}(\mathbf{p}_{\mathcal{L}}), \dots, p_{f-1}(\mathbf{p}_{\mathcal{L}}), p_f, p_{f+1}(\mathbf{p}_{\mathcal{L}}), \dots, p_K(\mathbf{p}_{\mathcal{L}})\big), \quad \forall f \in \mathcal{F} \qquad (6)$$

where the utility functions are given by:

$$\bar{u}_k(\mathbf{p}) = \begin{cases} u_k(\mathbf{p}) & \text{if } k \in \mathcal{L}, \\ (1 - \alpha_k)\, u_k(\mathbf{p}) & \text{if } k \in \mathcal{F}. \end{cases} \qquad (7)$$

The parameter $\alpha_k \in [0, 1]$, $k \in \mathcal{F}$, is a constant w.r.t.
the power levels which accounts for the sensing cost (to be
illustrative, we will choose $\alpha_k = \alpha$ in some places). This
constant has no effect on the equilibrium strategies. However,
when it comes to deciding whether to be a follower or
not, this constant will play a role. To elaborate further on
this constant, it can be interpreted as the fraction of time a
cognitive user $k \in \mathcal{F}$ spends sensing. In order to have
a good sensing capability^3, we assume that there exists a
certain energy threshold $\xi_{\min}$ (see e.g., [?]) expressed in
joule:

$$\alpha_k T \min_{\ell \in \mathcal{L}} (g_{k\ell}\, p_\ell) \geq \xi_{\min}$$

where T is the block duration in seconds and $p_\ell$ is in watts, whereas
$\alpha_k$ and $f(\cdot)$ are unitless; $g_{k\ell}$ is the channel gain between any
leader $\ell \in \mathcal{L}$ and the considered follower $k \in \mathcal{F}$. If the above
inequality holds, it means that the cognitive user $k \in \mathcal{F}$ is
able to sense the presence of the primary ones. Here,
we assume that the sensing constraint is feasible in the sense
that there exists a minimum fraction $\alpha_k < 1$, $k \in \mathcal{F}$, above
which the minimum energy threshold for sensing is attained.
A necessary condition for this is that $T \min_{\ell \in \mathcal{L}} (g_{k\ell}\, p_\ell) > \xi_{\min}$.
For the sake of clarity, we suppose that the sensing cost is
the same for every player: $\alpha_k = \alpha$, $k \in \mathcal{F}$. At this point,
the two-level hierarchical power control game is completely
defined: the players are the L non-cognitive transmitters and
the F cognitive transmitters, their action sets are $[0, P_k^{\max}]$,
and their utilities are defined by (7).

Following the standard methodology of equilibrium analysis
(see e.g., [?]), three important issues to be dealt with are
the existence, uniqueness, and efficiency issues for the Stackelberg
equilibrium. The next proposition provides an element of
response to the first two issues.

Proposition 3.1 ([22]): There always exists a Stackelberg
equilibrium $\mathbf{p}^{SE}$ in the two-level hierarchical game with $L \ge 1$
leaders and $F \ge 1$ followers. The power profile defined by

$$\forall \ell \in \mathcal{L}, \quad p_\ell^{SE} = \frac{\sigma^2}{g_\ell}\, \frac{N \gamma_L^* (N + \beta^*)}{N^2 - N(F-1)\beta^* - [(N + \beta^*)(L-1) + F\beta^*]\, \gamma_L^*}$$

$$\forall f \in \mathcal{F}, \quad p_f^{SE} = \frac{\sigma^2}{g_f}\, \frac{N \beta^* (N + \gamma_L^*)}{N^2 - N(F-1)\beta^* - [(N + \beta^*)(L-1) + F\beta^*]\, \gamma_L^*}$$

is an SE, where $\beta^*$ is the positive root of $x f'(x) - f(x)$, and $\gamma_L^*$
is the positive root of $x(1 - \varepsilon_L x) f'(x) - f(x)$, with

$$\varepsilon_L = \frac{F \beta^*}{N^2 - N(F-1)\beta^*}.$$

Moreover, the equilibrium $\mathbf{p}^{SE}$ is unique if the following two
conditions hold: (1) $\lim_{x \to 0} \frac{f''(x)}{f'(x)} > 2\varepsilon_L$, and (2) the equation
$x(1 - \varepsilon_L x) f'(x) - f(x) = 0$ has a single root in $(0, \beta^*)$.

^2 We assume, w.l.o.g., two players in a non-cooperative game with one leader
and one follower. If the leader plays the NE action, then, as the follower
observes this action and plays the best response against it, the follower will
play the NE strategy. Then, the NE strategy profile, if it exists, can be an SE.
The behavior is the same when there are several leaders and followers. If the
NE between the leaders corresponds to the NE of the game between leaders
and followers, then the followers respond by playing the NE.

^3 Under the assumption of single-user decoding (each useful signal is
detected by considering the other signals as noise), a good sensing capability
means that a follower can detect the existence of all the leaders. In particular,
the leader whose link with the follower is the worst is detected.

Note that the existence of such an equilibrium is ensured
by the properties of the Stackelberg game, especially the
sigmoidness of f [22]. In [22], it is also explained that the best
responses of the players are scalar-valued functions, which
facilitates the Stackelberg equilibrium analysis. Interestingly, the
provided sufficient conditions for uniqueness can be checked
to be satisfied for two typical efficiency functions used in the
related literature, namely $f(x) = (1 - e^{-x})^M$ and $f(x) = e^{-c/x}$,
$c > 0$, used in [16] and [4] respectively.
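
The equilibrium of Proposition 3.1 can be checked numerically. The sketch below is an illustration under explicit assumptions, not the authors' implementation: the SINR uses $\theta_j = 1/N$, the efficiency function is $f(x) = e^{-c/x}$ (so that $\beta^* = c$ and the root of $x(1 - \varepsilon_L x) f'(x) - f(x)$ is $c/(1 + \varepsilon_L c)$), $\varepsilon_L$ is taken as $F\beta^* / (N^2 - N(F-1)\beta^*)$, and the leaders' and followers' powers share the denominator $N^2 - N(F-1)\beta^* - [(N+\beta^*)(L-1) + F\beta^*]\gamma_L^*$. Function names and numerical values are hypothetical.

```python
def epsilon_L(F, N, beta):
    """Coupling term seen by a leader when F followers react (assumed form)."""
    return F * beta / (N * N - N * (F - 1) * beta)

def gamma_star_exponential(c, eps):
    """Root of x (1 - eps x) f'(x) = f(x) for the assumed f(x) = exp(-c/x)."""
    return c / (1.0 + eps * c)

def stackelberg_powers(g, L, sigma2, N, beta, gamma):
    """Powers of the L leaders (indices 0..L-1) and of the K-L followers."""
    K = len(g)
    F = K - L
    denom = N * N - N * (F - 1) * beta - ((N + beta) * (L - 1) + F * beta) * gamma
    if denom <= 0:
        raise ValueError("system load too high for these SINR targets")
    q_lead = sigma2 * N * gamma * (N + beta) / denom  # received power of a leader
    q_foll = sigma2 * N * beta * (N + gamma) / denom  # received power of a follower
    return [(q_lead if k < L else q_foll) / g[k] for k in range(K)]

def sinr(k, p, g, sigma2, N):
    """SINR of user k, assuming theta_j = 1/N."""
    interference = sum(g[j] * p[j] for j in range(len(p)) if j != k) / N
    return g[k] * p[k] / (sigma2 + interference)

# Hypothetical setting: K = 4 users, L = 2 leaders, spreading factor N = 8.
g, L, N, c = [1.0, 0.7, 0.4, 2.0], 2, 8, 1.0
beta = c
gamma = gamma_star_exponential(c, epsilon_L(len(g) - L, N, beta))
p = stackelberg_powers(g, L, 1.0, N, beta, gamma)
print([sinr(k, p, g, 1.0, N) for k in range(len(g))])
```

The printed SINRs can be compared with the targets: the leaders should operate at $\gamma_L^*$ and the followers at $\beta^*$, which is a quick consistency check of the closed-form powers against the SINR definition.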

Last but not least, we address the issue of efficiency for
the derived Stackelberg equilibrium. The key point at stake 
is whether cognition helps to obtain a better decentralized 
network in terms of global energy-efficiency. For this purpose, 
we first compare the utility a player would get in a system 
where no cognitive transmitters exist with the one he would 
obtain in a system with cognitive transmitters (i.e., at the 
Nash equilibrium corresponding to ifT^ ). Our main results are 
summarized by the following proposition. 

Proposition 3.2 (SE versus NE): In the two-level hierarchical
game with sensing cost, the utility of any leader at the SE is
always greater than or equal to the one obtained at the NE.

If the cost for sensing is negligible, the next corollary 
follows. 

Corollary 3.3 (SE versus NE with no sensing cost): The 
power profile at the SE Pareto dominates the power profile at 
the NE. 



The proofs of Proposition 3.2 and Corollary 3.3 are given in
[22]. Another relevant question, initially raised in a context
with a single leader and no sensing cost [22], is whether
it is better to follow or lead the power control game. In other
words, is it beneficial for a transmitter to be equipped with
a cognitive radio when sensing costs are accounted for? The
answer is provided below.

Proposition 3.4 (Following versus leading): At the Stackel- 
berg equilibrium of the two-level power control game with 
sensing cost, a transmitter prefers to be a follower (that is
to say, to sense) if the minimum energy threshold for sensing
verifies:

f(7D 

1- 



< 



■^ (^+7l) 



m {N+p-") 



T ijnngfp, 



SE 



(9) 



The proof of this result is given in App. B. Interestingly, it
is possible to provide an explicit lower bound on the energy 
threshold for a cognitive radio for being energy-efficient. For a 
transmitter, this bound mathematically translates the tradeoff 
between the benefit (in terms of energy-efficiency) of being 
informed about the actions played by the others and the cost 
induced by acquiring this knowledge. If the sensing cost is 
negligible, then following becomes always better than leading, 
giving an incentive to equip a transmitter with a cognitive 
radio. However, if all the transmitters of the network are 
cognitive, the network energy-efficiency is not maximized, 
which is what is proved in the next section. 
IV. The sensing game 

In the preceding section, the pair (F, L) and the identities
of cognitive and non-cognitive transmitters were fixed. A quite 
natural question is to ask whether the transmitters would 
effectively sense or not in a fully decentralized network where 
the decision to sense is left to them. Providing answers to this 
question is the purpose of this section. As a first step, we show 
the existence of an optimal number of cognitive transmitters in 
terms of social welfare (sum utility). The corresponding upper
bound can be used to assess the price of anarchy of the network
[19] and therefore to measure the cost of having this decision
decentralized. As a second step, we consider a sensing game
in which each transmitter has two actions (sense/not sense). 
It is shown that each transmitter can learn his best decision 
provided that the number of blocks is sufficiently large. 
A. On the optimal number of cognitive transmitters 

The global energy-efficiency of the network is measured in
terms of social welfare w, or sum utility at the equilibrium,
which is defined by:

$$w_L^{e} = \sum_{k=1}^{K} \bar{u}_k(\mathbf{p}_L^{e}) = \sum_{k=1}^{L} \bar{u}_k(\mathbf{p}_L^{e}) + \sum_{k=L+1}^{K} \bar{u}_k(\mathbf{p}_L^{e}) \qquad (10)$$

where $\bar{u}_k$ denotes the utility of transmitter k defined in (7) and
$e \in \{SE, NE\}$. Note that a Nash equilibrium is obtained
when L = K; indeed, in this context, there are no followers
and the game is no longer hierarchical. The subscript L has
been added to the equilibrium profiles to clearly indicate that
it is related to the number of leaders whenever $L \neq K$ (see
equation (10)). As the parameter L, which is the number of
leaders or non-cognitive transmitters, belongs to a discrete set
$\mathcal{K}$, the function $w_L : \mathcal{K} \to \mathbb{R}_+$ necessarily has a maximum^4.
Is this maximum reached at the extreme points L = 1 or
L = K? From Proposition 3.2 we know that it is not the
case. Indeed, if the sensing cost is small enough, then the
power profile at any SE Pareto-dominates the one of the NE
obtained with L = 1 or L = K. Thus, the latter points are
not maximizer candidates for the sum utility. However,
when the sensing cost is arbitrary, answering the question
analytically does not seem to be trivial. This is why we solve
this maximization problem numerically in Sec. V. To still get
some insights into the problem, we study a very special case
of it. The interest in doing so is that it clearly shows that the
sum utility maximizer is non-trivial and a little more about the
connection with the network load can be learned.

^4 Note that, however, we do not try to optimize the identity of the followers
or leaders with respect to their channel quality. This type of issue, which is
relevant in centralized scenarios, is addressed in [22] in a special case.

We now consider the special case defined by the following
four assumptions:

• Assumption 1: $g_k = g$, $R_k = R$ for all $k \in \mathcal{K}$.
• Assumption 2: $N \gg (K-1)\beta^*$.
• Assumption 3: $f(x) = e^{-c/x}$, $c > 0$.
• Assumption 4: $1 \le L \le K - 1$.
The social welfare at the SE is given by : 
Note that in [?], $c = 2^r - 1$, where r is the spectral efficiency of
the used channel coding scheme (in bit/s per Hz). A possible
choice is $r = R/B$ where R is the data transmission rate and B
the required bandwidth to transmit. While Assumptions 2 and
3 are reasonable (they respectively correspond to a small
system load and the case where the efficiency function is derived
from the outage probability on the mutual information [5]),
the first assumption is very strong but may hold in practice
in specific scenarios, e.g., in virtual multiple input multiple
output (MIMO) networks with clusters of transmitters [?]
with similar flow types (a voice service typically). Indeed, if
fast power control is considered, namely $g_k$ represents the fast
fading, this symmetry assumption will almost never be verified
in practice. Now, if slow power control is considered, $g_k$ may
represent shadowing and path loss effects, and the $g_k$'s will
be almost equal for all users in a given neighborhood
(as explained in [?] where the notion of clusters of users is
shown to be relevant). In any case, note that the symmetry
assumption is very local and is only exploited to reinforce the
existence of a non-trivial optimal number of followers/leaders
and provide insights on this number. Again, the goal is not to
claim a general expression for the optimal number of leaders.
Rather, we just want to derive it in a special case to better
understand the general optimization problem. Assumption 4
is not restrictive since the cases K — and i^ = L are ready. 
Under Assumption 2, we have that ol ~ N. From (IHll we 
have El ~ 0, which implies that jj^ sa /?*,f(7|^) sa f(/3'^), and 

N + P* ^ N + -fl' ^ ' 

This allows one to approximate the social welfare at the 



equilibrium (111 as 



~SE 



R 






L 



K - L 



(14) 



Note that the term 



gK0*)N 



(N+0*)a^ ^^ independent of L, and the 
optimal solution L* can be approximated by : 



L* 



arg max 



L K - 



(15) 



The main point in the above equation is that L* is related to
$\gamma_L^*$, which is the positive root of the equation $x(1 - \varepsilon_L x) f'(x) -
f(x) = 0$. It turns out that, under Assumption 3, a very simple
expression for $\gamma_L^*$ can be obtained. One can easily check that:

$$\gamma_L^* = \frac{c}{1 + \varepsilon_L c} \qquad (16)$$


Using the above, the optimization problem given by equation
(15) boils down to:



L* = arg max 



L 



K - L 



l + CLC 



= arg max {^lL) . 



Replacing the discrete variable L with a non-negative real $\lambda$,
the optimal solution of the corresponding concave function
can be checked to be:



A* 



(1 



N\ 



1- 

K 



'l- 


_K c \ 


if 


K 




if 



K-l 

N 
K-1 

N 



< 



> 



(17) 

where $\kappa > 0$ is arbitrarily small (this constraint is added to
meet the constraint $L \le K - 1$). Therefore, the optimal
number of non-cognitive transmitters can be approximated by
$L^* = \lfloor \lambda^* \rfloor$ or $L^* = \lceil \lambda^* \rceil$, depending on which number gives
the maximal social welfare. From this expression of L*, some
interesting insights can be easily extracted. First of all, note
that if the load is sufficiently small compared to 4^ but at the 
same time greater than -, the social welfare is maximized for 
L* = K — 1. Indeed, when the load is small, the interaction 
between players is not strong and the impact of a hierarchy 
on the overall system is small (at least for a reasonably large 
system). Thus, the social welfare is maximized at the NE 
point, which corresponds to the case where all users are leaders and play at the same time. A second type of insight can be obtained by considering the spectral efficiency related to the channel coding, namely $c = 2^r - 1$. As already mentioned, note that it is assumed that the spectral efficiency involved by the multiple access technique is sufficiently small (Assumption 2). If we go further by assuming that both the multiple access technique spectral efficiency ($K/N \to 0$) and channel coding spectral efficiency ($c \to 0$) are small, we obtain that:



A* 



(1 



N\ IK 
c ) 2 N' 



K 
2 



(18) 



which means that in the low spectral efficiency regime, the 
global energy-efficiency of the network is maximized when 
half of the transmitters are cognitive. If c is large but the 
load is still small (meeting the constraint ^^^ < ^) the 
same conclusion is obtained. Conducting a deep analysis 
to discuss the connections between spectral efficiency and 
energy-efficiency in decentralized cognitive networks in the 
general case (arbitrary load, with cost of sensing) seems to be 
a relevant and non-trivial extension of the simplified analysis 
provided here. 
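The rounding step used above (taking the floor or the ceiling of the continuous maximizer, whichever gives the larger welfare) can be sketched as follows. This is only an illustration: the function name, the clamping to $[0, K-1]$ and the concave objective used in the example are assumptions, since the exact welfare expression is not reproduced here.

```python
import math

def best_integer_leaders(lam_star, K, welfare):
    """Pick floor(lam_star) or ceil(lam_star), clamped to [0, K - 1],
    whichever yields the larger value of the (user-supplied) welfare."""
    candidates = {math.floor(lam_star), math.ceil(lam_star)}
    # Clamping range is an assumption made for this sketch.
    candidates = {min(max(L, 0), K - 1) for L in candidates}
    # Ties are broken in favour of the smaller candidate.
    return max(sorted(candidates), key=welfare)

# Hypothetical concave objective peaking at 8.4, used only for illustration.
example_welfare = lambda L: -(L - 8.4) ** 2
example_choice = best_integer_leaders(8.4, 17, example_welfare)
```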



B. Sensing game : description and key property 

In the two-level hierarchical power control game described in Sec. III, the transmitter is, by construction, either a cognitive transmitter or a non-cognitive one, and the action of a player consists in choosing his transmit power level. In fully decentralized networks, it is legitimate to ask what a transmitter would decide between sensing (being cognitive) and not sensing. To analyze such a problem, we assume that two games occur sequentially for each block (this separation assumption will be fully justified in Sec. IV-C). First, the transmitters decide to sense (S) or not to sense (NS). Then, they choose their power level based on their status (follower or leader) and therefore play the two-level power control game described in Sec. III. The sensing game can therefore be described by a static game whose representation is given by the following triplet:



$$\mathcal{G} = \left(\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{U_k\}_{k \in \mathcal{K}}\right) \qquad (19)$$

where the action sets are $\mathcal{A}_k = \mathcal{A} = \{\mathrm{S}, \mathrm{NS}\}$ and the utility functions are those obtained at the Stackelberg equilibrium of the power control game played in the second phase. If transmitter $k$ is a follower (i.e., he senses) and there are $F$ followers during the sensing phase of the block, then his utility is:



f/fc(S,s 



(F,L)^ 



(1 — ak)gkRk 



Kn 



Np^iN + jl+i) 
{N^^N/S* - [{N+nL+ {F + 1)(3*] 7i+i} 



where the notation $s_{-k}^{(F,L)}$ means that there are $F - 1$ followers and $L$ leaders in the set of players $\mathcal{K} \setminus \{k\}$. On the other hand, if transmitter $k$ chooses the action NS he obtains:



Uk{NS,siY^) 



QkRk 



f(7£ 



a2 iV72+,(7V + /3*) 
{N^-NP* - [{N+P*)L+ {F + 1)P*] 7I+1} 



in terms of utility. The considered sensing game is a congestion game [30] (and therefore a potential game) under strong conditions, but it is always a weighted potential game. The latter property is known to be very useful for studying the existence of pure NE and the convergence of learning algorithms or distributed iterative algorithms towards NE. For instance, in [25], [26], Monderer and Shapley proved that every weighted potential game has the Fictitious Play¹ Property (FPP). This guarantees that every fictitious play process converges in beliefs to equilibrium. All of this is the purpose of the remainder of this section. To make this paper sufficiently self-contained, we review several useful definitions concerning potential games [25].
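The fictitious play process mentioned above can be sketched for a two-player, two-action game as follows. This is a minimal illustration under stated assumptions: both players update simultaneously, beliefs start from a uniform count of one per action, and the payoff numbers are hypothetical (chosen so that NS is dominant); none of these choices is taken from the paper.

```python
def fictitious_play(U1, U2, steps=200):
    """Simultaneous fictitious play in a 2x2 game.

    U1[a1][a2] and U2[a1][a2] are the payoffs of players 1 and 2.
    Returns the empirical action frequencies of both players.
    """
    # counts[i][a]: number of times player i was observed playing action a
    # (initialised to 1 each, i.e. a uniform prior; an assumption of this sketch).
    counts = [[1, 1], [1, 1]]
    for _ in range(steps):
        belief_about_2 = [n / sum(counts[1]) for n in counts[1]]
        belief_about_1 = [n / sum(counts[0]) for n in counts[0]]
        # Expected payoff of each own action against the empirical frequencies.
        exp1 = [sum(belief_about_2[a2] * U1[a1][a2] for a2 in range(2)) for a1 in range(2)]
        exp2 = [sum(belief_about_1[a1] * U2[a1][a2] for a1 in range(2)) for a2 in range(2)]
        a1 = max(range(2), key=lambda a: exp1[a])
        a2 = max(range(2), key=lambda a: exp2[a])
        counts[0][a1] += 1
        counts[1][a2] += 1
    return [[n / sum(row) for n in row] for row in counts]

# Action index 0 = S (sense), 1 = NS (not sense); hypothetical payoffs.
U1_example = [[0.7, 0.8], [1.5, 1.0]]
U2_example = [[0.7, 1.5], [0.8, 1.0]]   # symmetric game: U2[a1][a2] = U1[a2][a1]
freqs = fictitious_play(U1_example, U2_example, steps=200)
```

With these payoffs NS is a best response to every belief, so the empirical frequencies concentrate on (NS, NS).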

Definition 4.1 (Monderer and Shapley 1996 [25]): The strategic form game $\mathcal{G}$ is a potential game if there is a potential function $V : \mathcal{A} \to \mathbb{R}$ such that

$$U_k(s_k, s_{-k}) - U_k(s'_k, s_{-k}) = V(s_k, s_{-k}) - V(s'_k, s_{-k}), \quad \forall k \in \mathcal{K},\ \forall s_k, s'_k \in \mathcal{A}_k.$$

¹ The FP learning algorithm can be found in the quoted references or [36], but essentially it consists in assuming that each player observes the actions of the others and maximizes his average utility based on the empirical frequencies of use of actions of the others.



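Theorem 4.2: The sensing game $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_k\}_{k \in \mathcal{K}}, \{U_k\}_{k \in \mathcal{K}})$ is an exact potential game if and only if one of the two following conditions is satisfied:

$$\forall i,j \in \mathcal{K}: \quad R_i g_i = R_j g_j, \qquad (20)$$

$$U_L(F+1, L+1) - U_L(F, L+2) = (1-\alpha)\left(U_F(F+2, L) - U_F(F+1, L+1)\right), \qquad (21)$$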

~J7,(F+1,L+1) 



where Ul{F + 1, L + 1) is defined by '^'''j?^^'^"^^'' when 
player i d ICis one of the F+l followers and Uf{F+1, L+l) 

is defined by - — '^„ ' when player i £ JC one of the 

L + l leaders. 



Condition (20) is a (strong) symmetry condition and is obtained under Assumption 1 (Sec. IV-A), which would be reasonable for a cluster of transmitters in a virtual MIMO network with a common service (e.g., voice). In fact, it is more realistic not to make these assumptions and to aim instead for a potential property which is sufficient for key issues such as convergence of some important learning dynamics.

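Definition 4.3 (Monderer and Shapley 1996 [25]): The strategic form game $\mathcal{G}$ is a weighted potential game if there is a vector $(\mu_i)_{i \in \mathcal{K}}$ and a potential function $V : \mathcal{A} \to \mathbb{R}$ such that:

$$\forall i \in \mathcal{K},\ \forall (s_i, s'_i) \in \mathcal{A}_i^2, \quad U_i(s_i, s_{-i}) - U_i(s'_i, s_{-i}) = \mu_i \left(V(s_i, s_{-i}) - V(s'_i, s_{-i})\right).$$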
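For a small finite game, the weighted potential condition of Definition 4.3 can be checked by brute force over all unilateral deviations. The sketch below is illustrative only: the function name, the three-player example, the weights and the potential are hypothetical and do not correspond to the utilities of the sensing game.

```python
from itertools import product

def is_weighted_potential(n_players, actions, utility, potential, weights, tol=1e-9):
    """Check U_i(s) - U_i(s') = mu_i (V(s) - V(s')) for every unilateral deviation."""
    for profile in product(actions, repeat=n_players):
        for i in range(n_players):
            for dev in actions:
                if dev == profile[i]:
                    continue
                other = profile[:i] + (dev,) + profile[i + 1:]
                lhs = utility(i, profile) - utility(i, other)
                rhs = weights[i] * (potential(profile) - potential(other))
                if abs(lhs - rhs) > tol:
                    return False
    return True

ACTIONS = ("S", "NS")
weights_example = [1.0, 2.0, 0.5]   # hypothetical weight vector (mu_i)

def potential_example(profile):
    """Hypothetical potential depending only on the number of sensing players."""
    n_s = profile.count("S")
    return n_s * (len(profile) - n_s)

def utility_example(i, profile):
    """mu_i * V plus a term that ignores player i's own action."""
    others_sensing = sum(1 for j, a in enumerate(profile) if j != i and a == "S")
    return weights_example[i] * potential_example(profile) + 0.1 * others_sensing

def utility_broken(i, profile):
    """Adds an own-action term that the potential does not capture."""
    return weights_example[i] * potential_example(profile) + (1.0 if profile[i] == "S" else 0.0)
```

A term that depends only on the other players' actions cancels in every unilateral deviation, which is why `utility_example` passes the check while `utility_broken` does not.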

It turns out that such a vector can be found. 

Theorem 4.4: The sensing game $\mathcal{G} = (\mathcal{K}, \{\mathcal{A}_i\}_{i \in \mathcal{K}}, \{U_i\}_{i \in \mathcal{K}})$ is a weighted potential game with the weight vector:



Vi e /C, fM 



(22) 



The proof is given in App. F.

C. Equilibrium analysis 

1) Existence: First of all, note that since the sensing game is finite (i.e., both the number of players and the sets of actions are finite), the existence of at least one mixed NE is guaranteed [29]. Now, since the game is a weighted potential game, the existence of at least one pure NE is guaranteed [25]. We might restrict our attention to pure and mixed Nash equilibria. However, as will be clearly seen in the 2-player case study (Sec. IV-C3), this may pose a problem of fairness and efficiency. This is the main reason why we also study the set of correlated equilibria (App. A-A) of the sensing game. The concept of correlated equilibrium [2] allows one to enlarge the set of equilibrium utilities. Every utility vector inside the convex hull of the Nash equilibrium utilities is a correlated equilibrium utility, which guarantees the existence of correlated equilibria in general.

2) Uniqueness: Here, we provide a brief analysis of uniqueness for the pure NE. This matters since pure NE are attractors of important dynamics such as the replicator dynamics (which corresponds to the limit of important learning schemes) [?]. One obvious advantage of having uniqueness of the game outcome is to make the game predictable, which may be useful from a designer's standpoint. As mentioned above, by contrast, the number of correlated equilibria is generally greater than one and more typically infinite. The following proposition provides sufficient conditions under which the sensing game (always with costs) has a unique pure NE.

Proposition 4.5: Assume the following two conditions are 
satisfied : 

N^-Np* - [{N+p*){K - 1)+ 2/3*] 7|f_i 



> 1- 



(23) 



> 1- 



(24) 



Then the unique Nash equilibrium of the game is $(s_1^*, s_2^*, \ldots, s_K^*) = (\mathrm{NS}, \mathrm{NS}, \ldots, \mathrm{NS})$.

Condition (23) ensures that the non-sensing strategy NS dominates the sensing strategy S when none of the other players senses. Condition (24) ensures that the non-sensing strategy NS dominates the sensing strategy S when some of the other players sense. Both conditions together imply that the sensing strategy S is always a dominated strategy for each player. The unique Nash equilibrium of the game is therefore $(s_1^*, s_2^*, \ldots, s_K^*) = (\mathrm{NS}, \mathrm{NS}, \ldots, \mathrm{NS})$.

3) Efficiency: In a decentralized network, since little or no coordination between terminals is available, an important issue is the efficiency of the network at the equilibrium state. Are the mixed or pure NE of the sensing game efficient in terms of utility? To be illustrative and to understand the problem under investigation in depth, our choice, in this section, is to focus mainly on the 2-transmitter case, but most of the provided results can be extended to the general case $K > 2$.

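Theorem 4.6: [Number of NE] The matrix game has the following NE:

• a unique NE if and only if (C1): $\alpha > \frac{\beta^* - \gamma^*}{1 - \beta^*\gamma^*}$;
• three NE if and only if (C2): $\alpha < \frac{\beta^* - \gamma^*}{1 - \beta^*\gamma^*}$;
• an infinite number of NE if (C3): $\alpha = \frac{\beta^* - \gamma^*}{1 - \beta^*\gamma^*}$.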
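The three cases of Theorem 4.6 can be explored numerically by building the 2x2 sensing game and enumerating its pure NE. The sketch below is a hedged illustration: the normalized utilities (each divided by $R_i g_i / \sigma^2$) are our reading of the expressions used in App. C and App. D for $N = 1$, the efficiency function $f(x) = e^{-c/x}$ with $\beta^* = c$ is assumed, and the value $\gamma^* = 0.6$ is hypothetical.

```python
import math
from itertools import product

def normalized_utilities(alpha, beta, gamma, c):
    """Utilities of one player, up to the positive factor R_i g_i / sigma^2.

    Keys are (own action, other player's action). These expressions are an
    assumption of this sketch (our reading of the appendix formulas).
    """
    f = lambda x: math.exp(-c / x)
    nash = f(beta) * (1 - beta) / beta                                    # (NS, NS)
    leader = f(gamma) * (1 - gamma * beta) / (gamma * (1 + beta))         # NS against S
    follower = (1 - alpha) * f(beta) * (1 - gamma * beta) / (beta * (1 + gamma))  # S against NS
    both_sense = (1 - alpha) * nash                                       # (S, S)
    return {("NS", "NS"): nash, ("NS", "S"): leader,
            ("S", "NS"): follower, ("S", "S"): both_sense}

def pure_nash(u):
    """Enumerate the pure NE of the symmetric 2-player game defined by u."""
    acts = ("S", "NS")
    equilibria = set()
    for a1, a2 in product(acts, repeat=2):
        ok1 = all(u[(a1, a2)] >= u[(d, a2)] for d in acts)
        ok2 = all(u[(a2, a1)] >= u[(d, a1)] for d in acts)
        if ok1 and ok2:
            equilibria.add((a1, a2))
    return equilibria

def threshold(beta, gamma):
    """Sensing-cost threshold separating (C1) from (C2)."""
    return (beta - gamma) / (1 - beta * gamma)

c_ex = 2 ** 0.9 - 1          # r = 0.9, as in the first numerical scenario
beta_ex = c_ex               # beta* = c for f(x) = exp(-c / x)
gamma_ex = 0.6               # hypothetical value with gamma* < beta*
low_cost = pure_nash(normalized_utilities(0.2, beta_ex, gamma_ex, c_ex))
high_cost = pure_nash(normalized_utilities(0.7, beta_ex, gamma_ex, c_ex))
```

Because the factor $R_i g_i / \sigma^2$ is positive, dropping it does not change any player's best responses, so the set of pure NE is unaffected by the normalization.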



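The proof of this result is provided in App. C. There is also a strictly mixed equilibrium, which can be found using the indifference principle. Let $(x, 1-x)$ be the mixed strategy of player 1 and $(y, 1-y)$ the mixed strategy of player 2. As proven in App. D, there is a unique pair $(x^*, y^*)$ satisfying the indifference principle. The corresponding distribution is given by:

$$x^* = y^* = \frac{(1-\alpha)\frac{f(\beta^*)}{\beta^*}(1-\beta^*) - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*}}{X} \qquad (25)$$

with

$$X = (1-\alpha)\frac{f(\beta^*)}{\beta^*}(1-\beta^*) - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*} + \frac{f(\beta^*)}{\beta^*}(1-\beta^*) - (1-\alpha)\frac{f(\beta^*)}{\beta^*}\frac{1-\gamma^*\beta^*}{1+\gamma^*},$$

and the corresponding equilibrium utilities are, for $i \in \{1, 2\}$:

$$U_i(x^*, y^*) = \frac{R_i g_i}{\sigma^2} \cdot \frac{(1-\alpha)\left[\frac{f(\beta^*)}{\beta^*}(1-\beta^*)\right]^2 - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*}\,(1-\alpha)\frac{f(\beta^*)}{\beta^*}\frac{1-\gamma^*\beta^*}{1+\gamma^*}}{X}.$$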



Fig. 1 represents the three equilibrium utility points for a typical scenario. The shaded area represents the region of feasible utilities for a given scenario (described under Fig. 1). Operating at one of the pure NE can be unfair for one of the transmitters and therefore inefficient for a certain fairness criterion [12]. Operating at the mixed NE is clearly suboptimal since it is Pareto-dominated by some feasible pairs of utilities. A way of dealing with fairness and/or Pareto-inefficiencies is to induce correlated equilibria (CE) in the game.

In practice, having a correlated equilibrium means that the players have no interest in ignoring (public or private) signals which would recommend them to play according to a certain joint distribution over the action profiles of the game. In wireless networks, a correlated equilibrium can be induced by a common signalling from a source which is exogenous to the game. It may be a signal generated by the receiver itself but also an FM (frequency modulation) signal, or a GPS (global positioning system) signal, meaning that the additional cost for adding this signal may be zero if the terminals are already able to decode such a signal. Finally, note that such a coordination mechanism is scalable in the sense that it can accommodate a high number of transmitters; in practice, physical limitations may arise, e.g., if the signal is sampled into a finite number of bits. If $\alpha > \frac{\beta^* - \gamma^*}{1 - \beta^*\gamma^*}$, as there is only one NE, the convex hull of NE boils down to a point and there does not exist any correlated equilibrium other than this NE. Rather, we assume that the sensing cost verifies condition (C2), which is the case of interest since several NE exist (see Theorem 4.6). In this case the following result holds.

Theorem 4.7: Any convex combination of NE is a CE. In particular, if there exists a utility vector $\nu = (\nu_1, \nu_2)$ and a parameter $\lambda \in [0,1]$ such that:

$$\nu_1 = \lambda U_1(\mathrm{S}_1, \mathrm{NS}_2) + (1-\lambda) U_1(\mathrm{NS}_1, \mathrm{S}_2) \qquad (26)$$

$$\nu_2 = \lambda U_2(\mathrm{S}_1, \mathrm{NS}_2) + (1-\lambda) U_2(\mathrm{NS}_1, \mathrm{S}_2), \qquad (27)$$

then $\nu$ is a correlated equilibrium.

Clearly, a signal recommending the transmitters to play the action profile $(\mathrm{S}_1, \mathrm{NS}_2)$ (resp. $(\mathrm{NS}_1, \mathrm{S}_2)$) for a fraction of the time equal to $\lambda$ (resp. to $1 - \lambda$) induces a correlated equilibrium. This specific signalling structure leads to the set of equilibria represented by the bold segment in Fig. 1. The figure illustrates the potential gains which can be obtained by implementing a simple coordination mechanism in the sensing game with costs.

We would like to end this section dedicated to the efficiency 
of the equilibria of the game by mentioning the potential sub- 
optimality induced by playing the sensing game and power 
control game separately (in two consecutive phases). Indeed, 
it would be legitimate to ask about what would happen if a 
transmitter were deciding jointly whether to sense or not and 
his power level. In such a case the action set of a transmitter 
would be : 



$$\mathcal{A}_k = \{\mathrm{S}_k, \mathrm{NS}_k\} \times [0, P_k^{\max}]. \qquad (28)$$

An action $a_k = (s_k, p_k)$ therefore has two components. The first component is discrete whereas the second component is continuous. This framework is referred to as hybrid control in
control theory ifTOl . EtII . While the control theory literature 
is rich concerning hybrid control, this is not the case for 



hybrid control games. In particular, general existence theorems 
for Nash equilibria seem to be unavailable. This is one of 
the reasons we will only consider the special case of two 
transmitters. In the 2-player hybrid control game it can be easily seen that the two pure NE of the sensing game are no longer equilibria in this new game. Instead, we have the following result.

Proposition 4.8: The unique Nash equilibrium of the 2-player hybrid control game is given by:

$$(a_1^*, a_2^*) = \left(\mathrm{NS}_1, p_1^{\mathrm{NE}}, \mathrm{NS}_2, p_2^{\mathrm{NE}}\right) \qquad (29)$$



where p^^ is given by p|. 

This result immediately follows from the fact that every action of the form $(\mathrm{S}, p_k)$ is dominated by the action $(\mathrm{NS}, p_k)$. Although the proof of this result is trivial, the interpretation is nonetheless interesting. It shows the existence of a Braess paradox in the hybrid control game: although the players have more options in the hybrid game, the equilibrium utilities are less than those obtained in the separated case where they first decide to sense or not and then adapt their power level. In addition to implementation considerations, this gives us another reason to perform the decision process in two consecutive phases.

V. Numerical results 

In this section, numerical results are provided to validate our theoretical claims. Note that, although simple scenarios are considered, the authors believe that most of the messages and insights conveyed by the present numerical analysis hold in more advanced simulation setups, e.g., considering standardized channel modulation and coding schemes (MCS), real frequency selective channel impulse responses, imperfect channel state information, and sensing techniques accounting for estimation noise. Indeed, as explained in [16], the choice of a specific MCS will generally lead to a packet success rate having the assumed properties. As shown in [11], the case of frequency selective channels is treatable once the frequency flat case has been treated. Therefore, only the impact of channel estimation noise seems to be more uncertain and would call for a more challenging extension of the results provided here. We consider a random CDMA scenario with a spreading factor equal to $N$ and the efficiency function is chosen to be $f(x) = e^{-\frac{2^r - 1}{x}}$ with different parameters $r$ [4].

We consider two scenarios. The first scenario is provided in Fig. 1. This scenario provides a clear understanding of the variety of equilibria in the sensing game. The pure, mixed, and correlated equilibria are represented on the utility region. The utility region is that of the sensing game with two players ($K = 2$), no spreading ($N = 1$), the sensing cost $\alpha = 20\%$, the sigmoidal function $f(x) = e^{-\frac{2^r - 1}{x}}$ with $r = 0.9$ and the following parameters: $R_1 g_1 = 2$, $R_2 g_2 = 2.5$, $\sigma^2 = 1$. The pure actions lead to the utilities marked by circles; the dark green region corresponds to the pairs of utilities that are achievable with mixed actions, whereas the light green region corresponds to the utilities that are achievable only with correlated actions. The two pure equilibrium utilities, denoted by +, correspond to both upper left extremal pure utilities; the completely mixed equilibrium utility, denoted by ×, is located in the interior of the dark green region. The blue line between the two pure equilibria represents a subset of correlated equilibrium utilities that corresponds to the Pareto-optimal frontier. The Nash bargaining solution, denoted by *, corresponds to the intersection of the hyperbolic curve with the set of correlated equilibrium utilities. It provides a fair and optimal equilibrium solution for the sensing game.

The second scenario considers a sensing game with 17 players, $N = 128$ sub-carriers, the sensing cost $\alpha = 5\%$, and the sigmoidal function $f(x) = e^{-\frac{2^r - 1}{x}}$. For simplicity, we assume a homogeneous scenario in terms of transmission rate $R_k = R$ and $r = 3$ bit/s per Hz for different numbers of leaders. Note that the value for $R$ will not matter since only normalized/relative performance gains will be considered. The seemingly non-typical choice for $K$ results from typical choices on the other parameters. Indeed, when fixing the spreading factor to $N = 128$ (typical e.g., in cellular systems), the spectral efficiency to $r = 3$ bit/s per Hz (typical in cellular systems as well), one finds that the maximum number of admissible users for the Nash equilibrium to be implemented is 18 (see the
denominator of (4)) :r = 3^c = 7^/3*=7^ ^^ < 
^ ^ K < 18 where f{x) — e^S, c = 2^ — 1, /3* is the unique 
solution of $x f'(x) - f(x) = 0$. Fig. 2 and Fig. 3 allow one to evaluate the improvement brought by the Stackelberg approach compared to the Nash equilibrium approach for the utility of one leader, one follower and the social utility. The sensing cost influences the results in two ways. First, the sensing cost affects the gain obtained by the follower compared to the leader. Indeed, in this figure, the improvement of a leader is always larger than the improvement of one follower, which is not true in general. Second, the improvement of the utility of one follower compared to the Nash equilibrium utility is negative when the number of leaders is strictly more than 14. In that case, the sensing cost is not compensated by the improvement due to the Stackelberg approach compared to the Nash equilibrium approach. The optimal number of leaders is 5 when considering either the improvement of the utility of one leader or the improvement of the utility of one follower. The improvement of the sum utility at the Stackelberg equilibrium compared to the social utility at the Nash equilibrium for the sensing game is given in Fig. 3. The sensing cost decreases the improvement of the social utility. This is especially the case when the number of followers is large and the number of leaders is small. The Stackelberg approach provides up to 16.5% of improvement compared to the Nash equilibrium approach. Fig. 4 illustrates how much the total power consumption can be reduced for the sensing game with $K = 17$ players, $N = 128$ sub-carriers, the sigmoidal function $f(x) = e^{-\frac{2^r - 1}{x}}$ with $r = 3$ for different numbers of leaders: note that at the same time, the energy-efficiency is optimized. The best reduction of the power consumption is achieved when the number of leaders is 5; the reduction of the power consumption is then more than 16%. We observe that the total power reduction is maximal when the number of cognitive users maximizes the social welfare of the system. Finally, Fig. 5 represents the improvement of the maximal social welfare (depending on the number of cognitive users) of the sensing game compared to the social welfare at the Nash equilibrium solution, depending on the load ($K/N$) of the system. The four curves correspond to different sensing costs $\alpha \in \{0\%, 5\%, 10\%, 15\%\}$. When the load approaches its maximal value $1/\beta^* + 1/N$, the improvement of the social utility is greater than 100%. We can then conclude that our hierarchical framework with an optimal number of cognitive terminals becomes more efficient in terms of social utility when a cognitive wireless network is highly loaded.

VI. Conclusion 

In this paper, we have introduced a new power control 
game where the action of a player is hybrid, one component 
of the action is discrete while the other is continuous. The 
first component is discrete since it corresponds to deciding 
whether to sense the radio environment or not ; the second 
component is continuous because it corresponds to choosing 
the transmit power level in an interval. Whereas the general 
study of hybrid games is of independent and game-theoretic 
interest and remains to be done, it turns out that in our case 
we can prove the existence of a kind of Braess paradox which 
allows us to restrict our attention to two separate games played 
consecutively : choosing the discrete and continuous actions 
jointly is less efficient than choosing them separately over 
time. The power control game is studied in detail and it 
is shown that there exists an optimal number of cognitive 
transmitters which maximizes the network utility, meaning 
that introducing too much cognition is not globally energy- 
efficient. This holds whether the cost of sensing is set to 
zero or not. From an individual point of view, the intuition 
which consists in saying that sensing is beneficial only if 
the sensing cost is acceptable, can be proved. As distributed 
networks are considered, global efficiency of the network is 
generally not guaranteed. Equilibria are indeed less energy- 
efficient (say in terms of sum utility) than the centralized 
solution. The (hierarchical) approach we propose can therefore 
be seen as a tradeoff in terms of global performance and 
required signaling. Conducting a refined analysis in terms of 
signaling for the power control problem would be relevant. 
On the other hand, the sensing game can be shown to have 
desirable properties like being weighted potential. This is a 
key property since many learning algorithms are known to 
converge in such games, proving that this decision can be 
learned over time with partial information only. Additionally, 
this game is shown to have a non-trivial set of correlated 
equilibria. These equilibria are very useful since they allow 
one to introduce some fairness among the transmitters and 
can be stimulated by a public signal incurring no cost in 
terms of extra signalling from the receiver ; in this respect 
the famous Nash bargaining solution (used in the wireless 
literature for having both a fair and cooperative solution, see 
e.g., [20], [24]) can be reached. This work therefore provides
several new results of practical interest for cognitive wireless 
networks but, of course, the proposed concepts would need to 
be developed further to make them more appealing in terms 
of implementability. In particular, technical issues related to 
spectrum usage might be considered by introducing frequency selective channels and the corresponding power allocation problem. Considering a more general structure of interference networks, the relevance of successive interference cancellation in terms of energy-efficiency might be assessed. Of course, classical issues such as the impact of channel estimation are also of practical interest, especially given that some learning algorithms are known to be robust against this type of error.

Appendix A 
Review of some game-theoretic concepts 

A. Correlated Equilibria 

In this section, we provide the definition of correlated equilibrium (CE). This equilibrium concept was introduced by Aumann in [3] and extends the concept of Nash equilibrium. Correlated equilibria are used in Section IV-C3 in order to provide fairer equilibrium solutions.

Definition A.1: A probability distribution $Q \in \Delta(\mathcal{A})$ is a canonical correlated equilibrium if for each player $i$, for each action $a_i \in \mathcal{A}_i$ that satisfies $Q(a_i) > 0$ we have:

$$\sum_{a_{-i} \in \mathcal{A}_{-i}} Q(a_{-i} \mid a_i)\, u_i(a_i, a_{-i}) \geq \sum_{a_{-i} \in \mathcal{A}_{-i}} Q(a_{-i} \mid a_i)\, u_i(b_i, a_{-i}), \quad \forall b_i \in \mathcal{A}_i. \qquad (30)$$

The result of Aumann 1987 [3] states that any correlated equilibrium corresponds to a canonical correlated equilibrium.
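The inequalities (30) are linear in $Q$ and can be checked directly for a two-player game. The sketch below is a minimal illustration: it uses the equivalent unnormalized form (both sides of (30) multiplied by $Q(a_i) > 0$), and the utility table is hypothetical, chosen only to mimic an anti-coordination structure with two pure NE.

```python
def is_correlated_equilibrium(Q, U, actions, tol=1e-12):
    """Check the CE inequalities for a 2-player game.

    Q: dict mapping (a1, a2) -> probability (missing profiles count as 0).
    U: dict mapping (a1, a2) -> (u1, u2).
    Uses sum_{a_-i} Q(a_i, a_-i) [u_i(b_i, a_-i) - u_i(a_i, a_-i)] <= 0.
    """
    for i in (0, 1):
        for ai in actions:          # recommended action
            for bi in actions:      # candidate deviation
                gain = 0.0
                for aj in actions:  # action of the other player
                    prof = (ai, aj) if i == 0 else (aj, ai)
                    dev = (bi, aj) if i == 0 else (aj, bi)
                    gain += Q.get(prof, 0.0) * (U[dev][i] - U[prof][i])
                if gain > tol:
                    return False
    return True

ACTS = ("S", "NS")
# Hypothetical utilities (u1, u2), for illustration only.
U_example = {
    ("S", "S"): (0.5, 0.5),
    ("S", "NS"): (1.0, 1.2),
    ("NS", "S"): (1.2, 1.0),
    ("NS", "NS"): (0.6, 0.6),
}
# Time-sharing between the two pure NE with lambda = 0.3.
Q_time_sharing = {("S", "NS"): 0.3, ("NS", "S"): 0.7}
Q_both_sense = {("S", "S"): 1.0}
```

In this example the time-sharing distribution between the two pure NE passes the check, whereas putting all the mass on (S, S) does not, since either player would gain by deviating to NS.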

Theorem A.2 (Aumann 1987, prop. 2.3 [3]): The utility vector $u$ is a correlated equilibrium utility if and only if there exists a distribution $Q \in \Delta(\mathcal{A})$ satisfying the linear inequality constraint (30) with $u = \mathbb{E}_Q U$.
B. Potential of the Sensing Game

In this section, we provide the potential function of the sensing game presented in Section IV.

Theorem A.3: The equilibria of the above potential game are the maximizers of the Rosenthal potential function [32].

{S={Ai,..., Ak)\S e NE} = arg max $(i^, L) 



arff max 



L 

E 

i=i 



■1=1 
UiNS.K-j,j) 



The proof follows directly from that of Rosenthal's theorem [32]. Let us simplify the expression of the potential function, which gives:



HF,L) 



(1 



F L 

^ 9iRi Kn 



•T.'-^ 



+ E 



[(7V+/3*)(7^-J)+(i + l)/3*]7(V«) 
?1^ ^Ml (N'^NB*- 



{N^~NI3* 



[{N+p*)j+(K^j + l)P*]^*) 



Appendix B 
Proof of Proposition 3.4



In this section, we prove Proposition 3.4. In order to
have good sensing capabilities, there exists a certain energy 
threshold fmin such that: 



a T min {gf/,pi>) > ^ 



a> 



Smir 

T min£g£ 



■) 



where $T$ is the block duration, $g_{f\ell}$ is the channel gain between any leader $\ell \in \mathcal{L}$ and the considered follower $f \in \mathcal{F}$, $p_\ell$ is the power level of the leader $\ell \in \mathcal{L}$ and $\alpha$ is the sensing cost. The utility of the follower is maximized when the sensing cost $\alpha$ is minimal, that is $\alpha^*$



Tiaine^cigfePi) ' 



^ax„elo,i]Uf{a) 



ur 



> 1 



(1-a*) 



RkKP* 



RkKP*) 



> 1 



1- 



6. 



'ip{N + (3*) 



> 1 



T min£g£ (gfePi) ^-^^{N + jI) 



f(7£) 



1 > 

(N + ll) 
'-^{N + lD 
^n



T min£g£ [gftPi) 



t&c 



f(7l 



T min (gfePe) ( 1 — -jrir^ 



>en 



This concludes the proof of Proposition 3.4.

Appendix C 
Proof of Theorem 4.6



In this section, we provide the proof of Theorem 4.6 by characterizing the equilibria of the 2-player sensing game. The first important remark is that the Nash equilibrium utilities are always dominated by the Stackelberg equilibrium utilities. This implies that the following inequalities hold for any parameter $\alpha > 0$:

$$U_1(\mathrm{NS}_1, \mathrm{S}_2) > U_1(\mathrm{S}_1, \mathrm{S}_2),$$
$$U_2(\mathrm{S}_1, \mathrm{NS}_2) > U_2(\mathrm{S}_1, \mathrm{S}_2).$$

Thus the action profile $(\mathrm{S}_1, \mathrm{S}_2)$ is not an equilibrium of the game. To compute the equilibria of this game, it remains to compute the following differences:

$$U_1(\mathrm{NS}_1, \mathrm{NS}_2) - U_1(\mathrm{S}_1, \mathrm{NS}_2),$$
$$U_2(\mathrm{NS}_1, \mathrm{NS}_2) - U_2(\mathrm{NS}_1, \mathrm{S}_2).$$

Up to the positive factor $R_i g_i$, the above differences are equal and do not depend on a particular player. Suppose first that condition (C2) is met:

$$\alpha < \frac{\beta^* - \gamma^*}{1 - \beta^*\gamma^*} \iff \alpha < 1 - \frac{(1-\beta^*)(1+\gamma^*)}{1 - \gamma^*\beta^*} \iff (1 - \beta^*) < (1-\alpha)\frac{1 - \gamma^*\beta^*}{1+\gamma^*} \iff \frac{f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*} < (1-\alpha)\frac{f(\beta^*)(1 - \beta^*\gamma^*)}{\sigma^2\beta^*(1+\gamma^*)}.$$

The last inequality implies that the game has two pure equilibria $(\mathrm{NS}_1, \mathrm{S}_2)$, $(\mathrm{S}_1, \mathrm{NS}_2)$ and one strictly mixed equilibrium $(x^*, y^*)$ defined by equation (25). If condition (C1) is satisfied, then the strategies $(\mathrm{S}_1)$ and $(\mathrm{S}_2)$ are dominated and the game has one pure equilibrium $(\mathrm{NS}_1, \mathrm{NS}_2)$; if condition (C3) is met, the game has an infinite number of NE.



Appendix D 
Mixed Nash Equilibria 

If condition (C2) is met, the sensing game has a strictly 
mixed equilibrium. In this section, we provide a characteri- 
zation of the mixed equilibrium strategies {x*,y*) using the 
indifference principle. 



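$$y^* \frac{R_1 g_1 f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*} + (1-y^*)\frac{R_1 g_1 f(\gamma^*)(1-\gamma^*\beta^*)}{\sigma^2\gamma^*(1+\beta^*)} = y^*(1-\alpha)\frac{R_1 g_1 f(\beta^*)(1-\gamma^*\beta^*)}{\sigma^2\beta^*(1+\gamma^*)} + (1-y^*)(1-\alpha)\frac{R_1 g_1 f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*}$$

which is equivalent to:

$$y^* \left[\frac{R_1 g_1 f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*} - \frac{R_1 g_1 f(\gamma^*)(1-\gamma^*\beta^*)}{\sigma^2\gamma^*(1+\beta^*)} - (1-\alpha)\frac{R_1 g_1 f(\beta^*)(1-\gamma^*\beta^*)}{\sigma^2\beta^*(1+\gamma^*)} + (1-\alpha)\frac{R_1 g_1 f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*}\right] = (1-\alpha)\frac{R_1 g_1 f(\beta^*)(1-\beta^*)}{\sigma^2\beta^*} - \frac{R_1 g_1 f(\gamma^*)(1-\gamma^*\beta^*)}{\sigma^2\gamma^*(1+\beta^*)}.$$

Then we obtain:

$$y^* = \frac{(1-\alpha)\frac{f(\beta^*)}{\beta^*}(1-\beta^*) - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*}}{X}$$

with

$$X = (1-\alpha)\frac{f(\beta^*)}{\beta^*}(1-\beta^*) - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*} + \frac{f(\beta^*)}{\beta^*}(1-\beta^*) - (1-\alpha)\frac{f(\beta^*)}{\beta^*}\frac{1-\gamma^*\beta^*}{1+\gamma^*}.$$

Replacing the above $y^*$ into the indifference equation, we obtain the utility of player 1 at the mixed equilibrium:

$$U_1(x^*, y^*) = \frac{R_1 g_1}{\sigma^2} \cdot \frac{(1-\alpha)\left[\frac{f(\beta^*)}{\beta^*}(1-\beta^*)\right]^2 - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*}\,(1-\alpha)\frac{f(\beta^*)}{\beta^*}\frac{1-\gamma^*\beta^*}{1+\gamma^*}}{X}.$$

The same argument applies to player 2:

$$U_2(x^*, y^*) = \frac{R_2 g_2}{\sigma^2} \cdot \frac{(1-\alpha)\left[\frac{f(\beta^*)}{\beta^*}(1-\beta^*)\right]^2 - \frac{f(\gamma^*)}{\gamma^*}\frac{1-\gamma^*\beta^*}{1+\beta^*}\,(1-\alpha)\frac{f(\beta^*)}{\beta^*}\frac{1-\gamma^*\beta^*}{1+\gamma^*}}{X}.$$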


Appendix E 
Proof of Theorem 4.2



The proof of Theorem 4.2 uses Corollary 2.9 of [25] (see also Theorem 3.1 in [33]).

Theorem E.1: The game $\mathcal{G}$ is a potential game if and only if for every pair of players $i,j \in \mathcal{K}$, every pair of actions $s_i, t_i \in \mathcal{A}_i$ and $s_j, t_j \in \mathcal{A}_j$ and every joint action $s_k \in \mathcal{A}_{\mathcal{K} \setminus \{i,j\}}$, we have that

$$U_i(t_i, s_j, s_k) - U_i(s_i, s_j, s_k) + U_i(s_i, t_j, s_k) - U_i(t_i, t_j, s_k) + U_j(t_i, t_j, s_k) - U_j(t_i, s_j, s_k) + U_j(s_i, s_j, s_k) - U_j(s_i, t_j, s_k) = 0.$$

Let us prove that the two conditions provided by our theorem are equivalent to the one of Corollary 2.9 in [25]. We introduce the following notation, defined for each player $i \in \mathcal{K}$ and each action $T \in \mathcal{A}$.



M» 



RiQi 



and U {ti,tj,Sk) 



Mj 



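For every pair of players $i,j \in \mathcal{K}$, every pair of actions $s_i, t_i \in \mathcal{A}_i$ and $s_j, t_j \in \mathcal{A}_j$ and every joint action $s_k \in \mathcal{A}_{\mathcal{K} \setminus \{i,j\}}$, we have the following equivalences:

$$U_i(\mathrm{NS}_i, \mathrm{S}_j, s_k) - U_i(\mathrm{S}_i, \mathrm{S}_j, s_k) + U_i(\mathrm{S}_i, \mathrm{NS}_j, s_k) - U_i(\mathrm{NS}_i, \mathrm{NS}_j, s_k) = U_j(\mathrm{NS}_i, \mathrm{S}_j, s_k) - U_j(\mathrm{NS}_i, \mathrm{NS}_j, s_k) + U_j(\mathrm{S}_i, \mathrm{NS}_j, s_k) - U_j(\mathrm{S}_i, \mathrm{S}_j, s_k)$$

$$\iff \mu_i = \mu_j \quad \text{or} \quad U_L(F+1, L+1) - U_L(F, L+2) + (1-\alpha)\left(U_F(F+1, L+1) - U_F(F+2, L)\right) = 0.$$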

Thus the sensing game is a potential game if and only if one of the two following conditions is satisfied:

$$\forall (i,j) \in \mathcal{K}^2, \quad R_i g_i = R_j g_j,$$

$$U_L(F+1, L+1) - U_L(F, L+2) = (1-\alpha)\left[U_F(F+2, L) - U_F(F+1, L+1)\right].$$

Appendix F
Proof of Theorem 4.4



The proof of Theorem 4.4 follows the same lines as the proof in App. E. It suffices to show that the auxiliary game defined as follows is a potential game.



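$$\tilde{\mathcal{G}} = \left(\mathcal{K}, \{\mathcal{A}_i\}_{i \in \mathcal{K}}, \{\tilde{U}_i\}_{i \in \mathcal{K}}\right) \qquad (31)$$

where the utilities are defined by the following equation, with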

A*4 — -i^- 



$$\tilde{U}_i(s_i, s_{-i}) = \frac{U_i(s_i, s_{-i})}{\mu_i}. \qquad (32)$$



From the above demonstration, it is easy to show that, for every pair of players $i,j \in \mathcal{K}$, every pair of actions $s_i, t_i \in \mathcal{A}_i$ and
[Plot legend residue: convex hull of the utility region; achievable utility region; Pareto-optimal correlated equilibria; pure action profiles. Axis label: utility of player 1 (bit/J).]

[Plot legend residue: improvement of the social utility for α = 0%, 5%, 10%, 15%.]

Fig. 1. The figure depicts, for a 2-transmitter scenario, the region of achievable utilities of the sensing game where each transmitter decides whether to sense or not to sense. The figure also shows the different equilibrium points of the game. One of the messages of this figure is the interest, in terms of fairness, in stimulating a correlated equilibrium instead of a Nash equilibrium. In particular, a Nash bargaining solution can be obtained.




[Axis label: number of leaders among 17 players]



Fig. 2. This figure represents the relative gain (in %) in terms of individual energy-efficiency obtained by equipping $F = K - L$ transmitters with a cognitive radio. For typical scenarios, we see that maximizing the number of cognitive transmitters is not optimal. On the other hand, if there is only one cognitive radio ($F = 1$ or $L = 16$, as assumed in [22]) one can degrade the individual performance for a typical value of the sensing cost (5% of the time-slot is spent for sensing).



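$s_j, t_j \in \mathcal{A}_j$ and every joint action $s_k \in \mathcal{A}_{\mathcal{K} \setminus \{i,j\}}$, we have the following equality for the utilities $\tilde{U}_i$ of the auxiliary game:

$$\tilde{U}_i(t_i, s_j, s_k) - \tilde{U}_i(s_i, s_j, s_k) + \tilde{U}_i(s_i, t_j, s_k) - \tilde{U}_i(t_i, t_j, s_k) = \tilde{U}_j(t_i, s_j, s_k) - \tilde{U}_j(t_i, t_j, s_k) + \tilde{U}_j(s_i, t_j, s_k) - \tilde{U}_j(s_i, s_j, s_k).$$

Using Corollary 2.9 in [25], we conclude that the sensing game is a weighted potential game.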



Fig. 3. This figure represents the improvement in terms of utility sum or network energy-efficiency at the Stackelberg equilibrium compared to the case of Nash equilibrium (i.e., no transmitter is equipped with a cognitive radio) with $K = 17$ players, $N = 128$, the efficiency function $f(x) = e^{-\frac{2^r - 1}{x}}$ with $r = 3$ bit/s per Hz and for different numbers of leaders. The different curves correspond to different values for the sensing cost: $\alpha \in \{0\%, 5\%, 10\%, 15\%\}$. When 5% of the time has to be spent for sensing, the network energy-efficiency can be improved by 13% whereas it is 17% when this cost is close to zero.




Fig. 4. From previous figures, we know that equipping the network with an adequate number of cognitive radios can significantly improve the network energy-efficiency. It turns out that not only is efficiency maximized by doing so but the total consumed power is also reduced. Scenario illustrated: $K = 17$ players, $N = 128$, the efficiency function is $f(x) = e^{-\frac{2^r - 1}{x}}$ with $r = 3$ for different numbers of leaders.



[Plot legend residue: improvement of the social utility (four curves); maximum load; r = 3.]






Fig. 5. The objective of this figure is to show the relationship between the maximum gain obtained in terms of network energy-efficiency and the spectral efficiency in terms of user load ($K/N$) (the spectral efficiency in terms of channel coding is always fixed and set to $r = 3$ bit/s per Hz). From this figure, it is seen that there is a strong interest in operating at a load level close to the maximum limit tolerated by the system (ensuring the existence of an equilibrium, see Sec. II-C).